High Utility Rare Itemset Mining over Transaction Databases
Abstract
High-utility rare itemset (HURI) mining finds itemsets in a database whose utility is no less than a given minimum utility threshold and whose support is less than a given frequency threshold. Identifying high-utility rare itemsets can support better business decisions by highlighting rare itemsets that yield high profits, so that they can be marketed more heavily. Some two-phase algorithms have been proposed to mine high-utility rare itemsets: rare itemsets are generated in the first phase, and the high-utility rare itemsets are extracted from them in the second phase. However, a two-phase solution is inefficient because the number of rare itemsets is enormous and grows very quickly as the frequency threshold increases. In this paper, we propose an algorithm, UP-Rare Growth, which uses the UP-Tree data structure to find high-utility rare itemsets in a transaction database. Instead of finding the rare itemsets explicitly, the proposed algorithm considers the frequency and utility of itemsets together. We also propose a couple of effective strategies to avoid searching non-useful branches of the tree. Extensive experiments show that the proposed algorithm outperforms the state-of-the-art algorithms in terms of the number of candidates.
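The two-threshold definition in the abstract can be illustrated with a small brute-force sketch. This is not UP-Rare Growth and does not use a UP-Tree; it simply enumerates every itemset and applies both conditions. It assumes the definitions commonly used in utility mining, which the abstract does not spell out: support is the number of transactions containing the itemset, and utility is the sum of quantity times unit profit over those transactions. The function name `mine_huri`, its parameters, and the toy data are hypothetical.

```python
from itertools import combinations


def mine_huri(transactions, unit_profit, min_utility, max_support):
    """Brute-force enumeration of high-utility rare itemsets (illustration only).

    transactions: list of dicts mapping item -> purchased quantity
    unit_profit:  dict mapping item -> profit per unit (assumed utility model)
    Returns {itemset as sorted tuple: (support, utility)} for itemsets with
    utility >= min_utility and 0 < support < max_support.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            support = 0
            utility = 0
            for t in transactions:
                # The itemset counts only in transactions that contain all of it.
                if all(item in t for item in itemset):
                    support += 1
                    utility += sum(t[item] * unit_profit[item] for item in itemset)
            # Rare (support below the frequency threshold) and high utility.
            if 0 < support < max_support and utility >= min_utility:
                result[itemset] = (support, utility)
    return result


# Hypothetical toy database: four transactions over items a, b, c.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
unit_profit = {"a": 5, "b": 2, "c": 10}

huri = mine_huri(transactions, unit_profit, min_utility=20, max_support=3)
print(huri)  # {('a', 'c'): (2, 35), ('b', 'c'): (2, 44)}
```

Exhaustive enumeration like this is exponential in the number of items, which is the cost that tree-based pruning strategies such as those described in the abstract aim to avoid.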
Similar resources
A Fuzzy Algorithm for Mining High Utility Rare Itemsets – FHURI
Classical frequent itemset mining identifies frequent itemsets in transaction databases using only the frequency of item occurrences, without considering the utility of items. In many real-world situations, the utility of itemsets is based on the user's perspective, such as cost, profit or revenue, and is of significant importance. Utility mining considers using utility factors in data mining tasks. Utility-...
High Utility Itemset Mining
Data mining can be defined as an activity that extracts new nontrivial information contained in large databases. Traditional data mining techniques have focused largely on detecting the statistical correlations between the items that are more frequent in the transaction databases. Also termed frequent itemset mining, these techniques were based on the rationale that itemsets which appe...
Discovery of High Utility Itemsets Using Genetic Algorithm with Ranked Mutation
Utility mining is the study of itemset mining from the consideration of utilities. It is the utility-based itemset mining approach to find itemsets conforming to user preferences. Modern research in mining high-utility itemsets (HUI) from the databases faces two major challenges: exponential search space and database-dependent minimum utility threshold. The search space is extremely vast when t...
A Survey on High Utility Itemset Mining Using Transaction Databases
Data mining can be described as an activity that analyzes data and draws out new nontrivial information from large databases. Traditional data mining methods have focused on finding the statistical correlations between the items that appear frequently in the database. High utility itemset mining is an area of research where utility based mining is a descriptive type ...
A Conceptual Approach to Temporal Weighted Itemset Utility Mining
Conventional frequent pattern mining discovers patterns in transaction databases based only on the relative frequency of occurrence of items, without considering their utility. Until recently, rarity has not received much attention in the context of data mining. For many real-world applications, however, the utility of itemsets based on cost, profit or revenue is of importance. Most Association Rule...
Publication date: 2015